Improve zero checks in sparse and reduce #9325

IainNZ · 2014-12-12T07:03:53Z

Just noticed some checks with 0 that don't play nice with custom types. There are probably more of these, but I'm not sure about a systematic way to find them.

This is only my second "real" commit I think, so not sure my test code is kosher style-wise.

tkelman · 2014-12-12T07:17:06Z

I guess constructing a zero of a type should be comparable in performance to comparing to an integer 0 literal? And there might be some custom types where the former makes sense but the latter doesn't? I've had types where I wanted to put them in sparse matrices and I had to say comparison to zero (or any number) is always false in order for things to work.

The appveyor failure is due to #9189 until we can get a new nightly built, not this PR's fault.

rfourquet · 2014-12-12T07:25:22Z

Comparing x!=0 seems much faster than x!=zero(x) for x::BigInt. Maybe put zero(x) outside of loops?

ivarne · 2014-12-12T12:01:12Z

@JeffBezanson has required that we use a != 0 rather than a!=zero(a) in many instances. If proper promotion rules are in place, it should work exactly the same with the fallbacks for Number.

When would it be wrong to implement ==(::MyType, Integer) and ==(::Integer, ::MyType) for this to work?

As @rfourquet points out: Some types , like BigInt and BigFloat are much more efficient to compare to 0 than to construct a zero() instance for comparison.

toivoh · 2014-12-12T12:31:17Z

For sparse arrays, I think that we should consider if we want them to work
with any element type, not just numbers, and what that would mean.

ivarne · 2014-12-12T12:59:54Z

If sparse arrays should work with other types than immutable Numbers, it seems like it really need a customizable definition of zeroness. Neither x == 0 nor x == zero(x), (with zero meaning additive identity when generalized to matrices), seems like a perfect definition.

StefanKarpinski · 2014-12-12T13:20:26Z

Hmm. This is a bit of a problem. Maybe we need iszero that is functionally equivalent to x == zero(x) but can be made much more efficient? For containers making x == 0 work but no other scalar comparisons seems like a bad idea.

ViralBShah · 2014-12-12T13:22:31Z

As noted above, the reason it uses 0 and not zero(T) is because of what @JeffBezanson had once suggested. That is certainly problematic for non-zero types, but much of the sparse matrix code hasn't been tested with anything but Number.

StefanKarpinski · 2014-12-12T13:24:30Z

The real issue is that you may want a zero matrix to cound and zeros(3,3) == 0 is a method error.

toivoh · 2014-12-12T13:43:47Z

iszero seems reasonable. Should it default to always false for non-number types? What should a sparse matrix return when you ask for an absent entry? Or should that be an error, and you would have to ask if there is something present first? Or should it work like some kind of NullableArray?

johnmyleswhite · 2014-12-12T13:59:21Z

I wouldn't be excited to treat sparsity as equivalent to missingness. In my mind, asking for an element of a container with eltype T should always return something of type T.

toivoh · 2014-12-12T14:17:49Z

Maybe it makes little sense to have sparse arrays for non-numbers that
conform to the same interface add with numbers, but it would make sense to
reuse the underlying implementation for something similar.

Or maybe the generalisation is just to supply a default value for the
absent elements. I also seems to remember there has been some talk about
non-zero default values for numeric sparse matrices.

jiahao · 2014-12-12T15:56:52Z

Note that neither zeros(3,3) == 0 nor zeros(3,3) == zero(Matrix) works, since the latter method is not defined. (zero(Matrix) doesn't know what dimensions it should have, since the Matrix type does not carry that information.) So this code would still fail to work for containers of containers, say SparseMatrixCSC{Matrix}.

As @ivarne said, the only such comparison that makes sense to write for matrices currently is M == zero(M).

The other pitfall one can run into is breaking transitivity. Imagine defining zeros(3,3) == 0 and zeros(4,4) == 0, and then erroneously concluding that zeros(3,3) == zeros(4,4).

IainNZ · 2014-12-12T16:20:21Z

Wow this was more controversial than I thought it'd be, given it was already done partially but not consistently

I guess constructing a zero of a type should be comparable in performance to comparing to an integer 0 literal?

It should be free for bitstypes, but I don't know what happens for other types (whether it allocates a fresh zero every time round the loop or it is identifies that it is an invariant)...

Comparing x!=0 seems much faster than x!=zero(x) for x::BigInt. Maybe put zero(x) outside of loops?

... But this suggests it is a problem. Putting it outside the loop is depressingly ugly, but still preferable to the old cold IMO

When would it be wrong to implement ==(::MyType, Integer) and ==(::Integer, ::MyType) for this to work?

When that comparison doesn't make sense, or already has another meaning. e.g. JuMP defines ==(AffExpr, Integer) to return a LinearConstraint.

For sparse arrays, I think that we should consider if we want them to work with any element type, not just numbers, and what that would mean.

I mean, with this PR we have (probable) complete support for custom types for anything that makes sense. I don't see much point in restricting it to <: Number because I can still make custom types that are Numbers (or that I claim are numbers) but don't have a well defined comparison with 0. If a Number must support comparison with 0, that shouts "we need interfaces!" because that is hella subtle.

Hmm. This is a bit of a problem. Maybe we need iszero that is functionally equivalent to x == zero(x) but can be made much more efficient? For containers making x == 0 work but no other scalar comparisons seems like a bad idea.

Maybe, but then you'll need to define Base.zero(::Type{MyType}), Base.zero(x::MyType), AND Base.iszero(x::MyType) - we are shifting the burden to the user and making the standard library lazier, which doesn't feel good to me at all.

As noted above, the reason it uses 0 and not zero(T) is because of what @JeffBezanson had once suggested. That is certainly problematic for non-zero types, but much of the sparse matrix code hasn't been tested with anything but Number.

It seems like it mostly works though - these things I fixed are mainly second-order stuff, and there is already chunks of the code that are designed to work with any type (mainly for type stability reasons though rather than a desire to be generic)

Or maybe the generalisation is just to supply a default value for the
absent elements. I also seems to remember there has been some talk about
non-zero default values for numeric sparse matrices.

I will always push back (as will others) to this idea as I've never seen it in the wild, and it adds even more conceptual overhead to the "structural zeros" situation we are already in.

Note that neither zeros(3,3) == 0 nor zeros(3,3) == zero(Matrix) works, since the latter method is not defined. (zero(Matrix) doesn't know what dimensions it should have, since the Matrix type does not carry that information.) So this code would still fail to work for containers of containers, say SparseMatrixCSC{Matrix}. As @ivarne said, the only such comparison that makes sense to write for matrices currently is M == zero(M).

Thats a good point! I could change the code and add that as test.

ivarne · 2014-12-12T16:46:16Z

@IainNZ: Maybe, but then you'll need to define Base.zero(::Type{MyType}), Base.zero(x::MyType), AND Base.iszero(x::MyType) - we are shifting the burden to the user and making the standard library lazier, which doesn't feel good to me at all.

I can't really see how the standard library can be fully generic without having a well defined set of functions that have one single meaning each. iszero would have a fallback implementation for Number and AbstractArray, and will allow strange structures to implement enough of the "spare interface" without having to define comparison to Integer

jiahao · 2014-12-12T16:47:06Z

The iszero question is yet another example of the general issue of whether X<:Any is by default a scalar unless X<:AbstractArray. Closely related:

the zero(::Number) issues Restrict zero and one to Number types. Default to floating points. #6183 and Typing of array operations #6173,
the zero(::Array{Array}) issue Add support for zero(::Array{Array}) #8759,
the analogous discussion for transpose which spans Make (c)transpose work correctly for block matrices #7244 and Remove (c)transpose definitions from operators.jl. Add them to string.jl #9170,
the analogous issue for maximum(::Array{Any}), maximum of Any array #9276.

tkelman · 2014-12-12T21:32:58Z

that shouts "we need interfaces!"

Doing sparse matrices generically really demands this. What we're doing now isn't great, we get into unclear quagmires like this. That reminds me that I have an email from @jiahao to respond to.

There are different levels of how you want to deal with sparse matrices, structurally and numerically. At the structural stage, if something is given a value, any value, you want it stored explicitly. At the numerical stage, sometimes you want to compare the values against zero and remove them - but that doesn't always make sense.

IainNZ · 2015-02-26T01:03:37Z

I'm going to close this in the interests of tidiness, and delete the branch. Here are the tests that (IMO) should pass but don't, for posterities sake. Maybe we can revisit in the future.

# Test proper handling of zeros for user types
immutable SpTestVal
    value::Float64
end
(==)(x::SpTestVal,y::SpTestVal) = (x.value == y.value)
Base.zero(x::SpTestVal) = SpTestVal(0)
Base.zero(::Type{SpTestVal}) = SpTestVal(0)
A = sparse([1,2,3],[1,2,3],[SpTestVal(1),SpTestVal(0),SpTestVal(3)])
@test nnz(A) == 2  # zeros should be stripped by sparse
A[2,2] = SpTestVal(0)
@test nnz(A) == 2  # zeros should be stripped by setindex
A = SparseMatrixCSC(3,3,[1,2,3,4],[1,2,3],
        [SpTestVal(1.0),SpTestVal(0.0),SpTestVal(3.0)])
@test countnz(A) == 2
r,c,v = findnz(A)
@test length(r) == length(c) == length(v) == 2

Improve zero checks in sparse and reduce

527c646

IainNZ closed this Feb 26, 2015

IainNZ added test This change adds or pertains to unit tests sparse Sparse arrays labels Feb 26, 2015

IainNZ deleted the ird/sparsezero branch February 26, 2015 01:03

IainNZ mentioned this pull request May 23, 2015

Fix sparse matrix nits, and make sparse matrix construction fail if zero(Tv) not defined #11408

Merged

IainNZ mentioned this pull request Oct 6, 2015

RFC: SparseVectors, Take 2 #13440

Merged

11 tasks

tkelman mentioned this pull request Nov 24, 2016

Use zero in istriu, istril methods #19399

Closed

KristofferC mentioned this pull request Nov 26, 2017

SparseMatrix and SparseArray don't consitently define zero elements and allowed element types #24790

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve zero checks in sparse and reduce #9325

Improve zero checks in sparse and reduce #9325

IainNZ commented Dec 12, 2014

tkelman commented Dec 12, 2014

rfourquet commented Dec 12, 2014

ivarne commented Dec 12, 2014

toivoh commented Dec 12, 2014

ivarne commented Dec 12, 2014

StefanKarpinski commented Dec 12, 2014

ViralBShah commented Dec 12, 2014

StefanKarpinski commented Dec 12, 2014

toivoh commented Dec 12, 2014

johnmyleswhite commented Dec 12, 2014

toivoh commented Dec 12, 2014

jiahao commented Dec 12, 2014

IainNZ commented Dec 12, 2014

ivarne commented Dec 12, 2014

jiahao commented Dec 12, 2014

tkelman commented Dec 12, 2014

IainNZ commented Feb 26, 2015

Improve zero checks in sparse and reduce #9325

Improve zero checks in sparse and reduce #9325

Conversation

IainNZ commented Dec 12, 2014

tkelman commented Dec 12, 2014

rfourquet commented Dec 12, 2014

ivarne commented Dec 12, 2014

toivoh commented Dec 12, 2014

ivarne commented Dec 12, 2014

StefanKarpinski commented Dec 12, 2014

ViralBShah commented Dec 12, 2014

StefanKarpinski commented Dec 12, 2014

toivoh commented Dec 12, 2014

johnmyleswhite commented Dec 12, 2014

toivoh commented Dec 12, 2014

jiahao commented Dec 12, 2014

IainNZ commented Dec 12, 2014

ivarne commented Dec 12, 2014

jiahao commented Dec 12, 2014

tkelman commented Dec 12, 2014

IainNZ commented Feb 26, 2015